
    Exploring compression techniques for ROOT IO

    ROOT provides a flexible format used throughout the HEP community. Its wide range of use cases, from an archival data format to end-stage analysis, has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). At the scale of the LHC experiments, poor design choices can result in terabytes of wasted space or wasted CPU time. We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternative compression algorithms to optimize for read performance; an alternate method of compressing individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
    Comment: Proceedings for the 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2016).
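
    To make the tradeoff concrete, here is a minimal, hypothetical micro-benchmark using Python's standard-library zlib and lzma modules as stand-ins for the DEFLATE levels and alternative algorithms discussed above; it is not ROOT's actual compression API, and the sample payload is invented.

```python
# Hypothetical sketch of the compression-level tradeoff: higher levels
# shrink the output but cost more CPU time. Uses stdlib zlib/lzma as
# stand-ins for ROOT's compression settings.
import lzma
import os
import time
import zlib

payload = os.urandom(1 << 16) + b"event-data " * 50_000  # mixed-entropy sample

def measure(name, compress, decompress):
    t0 = time.perf_counter()
    blob = compress(payload)
    t1 = time.perf_counter()
    decompress(blob)
    t2 = time.perf_counter()
    ratio = len(payload) / len(blob)
    print(f"{name:8s} ratio={ratio:5.2f} "
          f"compress={t1 - t0:6.3f}s decompress={t2 - t1:6.3f}s")

for level in (1, 6, 9):  # low, default, and high DEFLATE levels
    measure(f"zlib-{level}",
            lambda d, l=level: zlib.compress(d, l),
            zlib.decompress)
measure("lzma", lzma.compress, lzma.decompress)
```

    On typical inputs this shows exactly the space-versus-CPU tradeoff the paper quantifies: higher levels and heavier algorithms save disk at the cost of (de)compression time.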

    Continuous Performance Benchmarking Framework for ROOT

    Foundational software libraries such as ROOT are under intense pressure to avoid software regressions, including performance regressions. Continuous performance benchmarking, as a part of continuous integration and other code quality testing, is an industry best practice for understanding how the performance of a software product evolves over time. We present a framework, built from industry best practices and tools, to help understand ROOT code performance and to monitor the efficiency of the code across several processor architectures. It additionally tracks historical performance measurements for the ROOT I/O, vectorization, and parallelization sub-systems.
    Comment: 8 pages, 5 figures, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics.
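
    The abstract does not spell out the framework's internals, but the regression check at the heart of any such system can be sketched as follows; this is a minimal sketch assuming past timings are available as a simple list, with the history store and CI wiring omitted.

```python
# Minimal sketch of a performance-regression check: flag a benchmark run
# whose timing is an outlier above the historical baseline. The real
# framework's storage, plotting, and CI integration are not shown.
import statistics

def is_regression(history, new_time, sigmas=3.0):
    """Return True if new_time exceeds the historical mean by > sigmas stdevs."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return new_time > mean + sigmas * stdev

history = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03]  # seconds, past commits
print(is_regression(history, 1.31))  # True: likely performance regression
print(is_regression(history, 1.01))  # False: within normal variation
```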

    Discovering Job Preemptions in the Open Science Grid

    The Open Science Grid (OSG) is a world-wide computing system which facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel. For compute clusters on the OSG, physical resources may be shared between OSG jobs and the cluster's local user-submitted jobs, with local jobs preempting OSG-based ones. As a result, job preemptions occur frequently in OSG, sometimes significantly delaying job completion time. We have collected job data from OSG over a period of more than 80 days. We present an analysis of the data, characterizing the preemption patterns and different types of jobs. Based on these observations, we group OSG jobs into five categories and analyze the runtime statistics for each category. We further choose different statistical distributions to estimate the probability density function of job runtime for the different classes.
    Comment: 8 pages.
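
    As an illustration of the last step, here is a sketch of fitting one candidate distribution to a job category's runtimes by maximum likelihood; the exponential model and the synthetic data are assumptions for illustration, not necessarily the distributions chosen in the paper.

```python
# Illustrative fit of a runtime distribution for one job category via
# maximum likelihood. For an exponential, the MLE of the rate is simply
# 1 / sample mean; the paper may use different distributions per class.
import numpy as np

rng = np.random.default_rng(0)
runtimes = rng.exponential(scale=120.0, size=1000)  # synthetic runtimes, minutes

rate = 1.0 / runtimes.mean()  # exponential MLE

def pdf(t):
    """Estimated probability density of job runtime t (minutes)."""
    return rate * np.exp(-rate * t)

print(f"estimated mean runtime: {1.0 / rate:.1f} min")
print(f"density at t=60 min:    {pdf(60.0):.5f}")
```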

    Extending ROOT through Modules

    The ROOT software framework is foundational for the HEP ecosystem, providing capabilities such as I/O, a C++ interpreter, a GUI, and math libraries. It uses object-oriented concepts and build-time components to layer functionality. We believe additional layering formalisms will benefit ROOT and its users. We present a modularization strategy for ROOT which aims to formalize the description of existing source components, make dependencies and other metadata available outside the build system, and allow post-install additions of functionality in the runtime environment. Components can then be grouped into packages, installable from external repositories, to deliver missing packages as a post-install step. This provides a mechanism for the wider software ecosystem to interact with a minimalistic install. Reducing intra-component dependencies improves maintainability and code hygiene. We believe helping maintain the smallest possible "base install" will help embedding use cases. The modularization effort draws inspiration from the Java, Python, and Swift ecosystems. Keeping aligned with modern C++, this strategy relies on forthcoming features such as C++ modules. We hope formalizing the component layer will provide simpler ROOT installs, improve extensibility, and decrease the complexity of embedding ROOT in other ecosystems.
    Comment: 8 pages, 2 figures, 1 listing, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics.
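
    A hypothetical sketch of the kind of externally available dependency metadata described above: a manifest mapping modules to their dependencies, resolved into a valid install order with a topological sort. The manifest format is invented here; only the module names echo real ROOT components.

```python
# Hypothetical module manifest exposed outside the build system, plus a
# topological sort giving an order in which packages can be installed so
# every dependency is present first.
from graphlib import TopologicalSorter

manifest = {
    "Core":   [],
    "RIO":    ["Core"],          # I/O layer depends on Core
    "Hist":   ["Core", "RIO"],   # histograms depend on Core and I/O
    "Tree":   ["Core", "RIO"],
    "RooFit": ["Hist", "Tree"],  # delivered as a post-install addition
}

order = list(TopologicalSorter(manifest).static_order())
print("install order:", order)
# e.g. ['Core', 'RIO', 'Hist', 'Tree', 'RooFit']
```

    With such metadata, a minimalistic "base install" could consist of Core alone, with everything else fetched on demand from external repositories.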

    Designing Computing System Architecture and Models for the HL-LHC era

    This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade.
    Comment: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan.

    Data Access for LIGO on the OSG

    During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory (LIGO) conducted a three-month observing campaign. These observations delivered the first direct detection of gravitational waves from binary black hole mergers. To search for these signals, the LIGO Scientific Collaboration uses the PyCBC search pipeline. To deliver science results in a timely manner, LIGO collaborated with the Open Science Grid (OSG) to distribute the required computation across a series of dedicated, opportunistic, and allocated resources. To deliver the petabytes necessary for such a large-scale computation, our team deployed a distributed data access infrastructure based on the XRootD server suite and the CernVM File System (CVMFS). This data access strategy grew from simply accessing remote storage to a POSIX-based interface underpinned by distributed, secure caches across the OSG.
    Comment: 6 pages, 3 figures, submitted to PEARC17.
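
    A toy read-through cache illustrating the general idea behind this strategy: serve data from a local cache when possible, and populate the cache from remote storage on a miss. The XRootD and CVMFS interfaces are far richer than this, and fetch_remote() plus the file name are made-up stand-ins.

```python
# Toy read-through cache: hit -> local POSIX read; miss -> fetch from
# remote storage and cache for subsequent readers. A sketch only; real
# XRootD/CVMFS caches handle security, consistency, and federation.
import pathlib

CACHE_DIR = pathlib.Path("/tmp/demo-cache")

def fetch_remote(name: str) -> bytes:
    """Stand-in for a remote XRootD read; here it just fabricates bytes."""
    return f"contents of {name}".encode()

def read(name: str) -> bytes:
    """Serve from the local cache, fetching and caching on a miss."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cached = CACHE_DIR / name
    if cached.exists():           # cache hit: local POSIX read
        return cached.read_bytes()
    data = fetch_remote(name)     # cache miss: go to remote storage
    cached.write_bytes(data)      # populate cache for later readers
    return data

print(read("example-frame-file.hdf"))  # first call misses, then is cached
```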

    Long Term Dynamics for Two Three-Species Food Webs

    In this paper, we analyze two possible scenarios for food webs with two prey species and one predator (a food web is similar to a food chain except that in a web we have more than one species at some levels). In neither scenario do the prey compete; rather, the scenarios differ in the selection method used by the predator. We determine how the dynamics depend on various parameter values. For some parameter values, one or more species dies out. For other parameter values, all species co-exist at equilibrium. For still other parameter values, the populations behave cyclically. We have even discovered parameter values for which the system exhibits chaos and has a positive Lyapunov exponent. Our analysis relies on common techniques such as nullcline analysis, equilibrium analysis, and singular perturbation analysis.
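
    The abstract does not reproduce the paper's equations or parameter values, so the following is only a generic two-prey, one-predator, Lotka-Volterra-style sketch, integrated with fourth-order Runge-Kutta, to show the kind of system whose long-term dynamics such an analysis examines; all parameters are illustrative.

```python
# Generic two-prey, one-predator system in the spirit of the food webs
# above. The model and all parameter values are invented for illustration;
# the paper's actual equations may differ.
import numpy as np

def deriv(s, r1=1.0, r2=0.8, a1=0.5, a2=0.4, b1=0.3, b2=0.2, d=0.6):
    x1, x2, z = s  # prey 1, prey 2, predator densities
    return np.array([
        x1 * (r1 - a1 * z),           # prey grow, lose to predation
        x2 * (r2 - a2 * z),
        z * (b1 * x1 + b2 * x2 - d),  # predator gains by eating prey
    ])

# Fourth-order Runge-Kutta integration of the system.
s, dt = np.array([1.0, 1.0, 0.5]), 0.01
for _ in range(5000):
    k1 = deriv(s)
    k2 = deriv(s + 0.5 * dt * k1)
    k3 = deriv(s + 0.5 * dt * k2)
    k4 = deriv(s + dt * k3)
    s += dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

print("state after t=50:", s)  # extinction, equilibrium, or cycles,
                               # depending on the chosen parameters
```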

    SciTokens: Capability-Based Secure Access to Remote Scientific Data

    The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems.
    Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, USA.
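
    A sketch of the capability idea using the widely used PyJWT library as a stand-in for the actual SciTokens client library: the token below carries a narrow read scope and a short lifetime rather than a general-purpose identity. The claim values, issuer URL, and signing key are hypothetical, and real deployments use asymmetric keys.

```python
# Capability-style token sketched with PyJWT (pip install pyjwt) as a
# stand-in; the actual SciTokens library and claim conventions differ.
# The token grants only a narrow read capability, not an identity.
import time

import jwt

SECRET = "demo-signing-key"  # hypothetical; real issuers use asymmetric keys

token = jwt.encode(
    {
        "scope": "read:/ligo/frames",         # capability, not identity
        "exp": int(time.time()) + 3600,       # short lifetime limits abuse
        "iss": "https://example.org/issuer",  # hypothetical token issuer
    },
    SECRET,
    algorithm="HS256",
)

# The resource server verifies the signature and expiry, then checks
# that the requested operation is covered by the token's scopes.
claims = jwt.decode(token, SECRET, algorithms=["HS256"])
assert "read:/ligo/frames" in claims["scope"].split()
print("authorized scopes:", claims["scope"])
```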